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ABSTRACT 


'in  this  paper  we  propose  and  develop  techniques  for  solving  structured, 
large-scale  convex  programming  problems*  The  procedure  is  a  combination  of  a 
decomposition  technique  of  Dantzig-Wolfe  type  and  the  proximal  point  method. 
The  proximal  point  method  is  used  to  overcome  the  drawbacks  of  the 
decomposition  technique. 

The  procedure  is  then  used  to  solve  block  angular  linear  programming 
problems.  By  exploiting  the  linearity  of  the  problem  we  have  several  variants 
of  the  procedure.  U 
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□  □ 


SIGNIFICANCE  AND  EXPLANATION 


Optimization  problems  which  arise  from  real-world  situations  are  usually 
complex  and  large.  It  is  almost  impossible  to  use  general  methods  to  solve 
these  problems.  Fortunately*  these  problems  always  have  certain  special 
features*  and  by  exploiting  these  special  features  we  can  obtain  an  optimal 
solution  for  the  problem.  The  decomposition  method  is  one  of  the  solution 
techniques  that  make  use  of  the  special  structure  of  the  problem.  The  idea  of 
the  decomposition  method  is  to  break  up  a  problem  into  a  sequence  of  simpler 
subproblems*  and  then  obtain  the  optimal  solution  for  the  problem  from  the 
optimal  solutions  of  the  subproblems. 

In  this  paper  we  discuss  a  decomposition  technique  and  its  disadvantages* 
and  -then  propose  to  use  the  proximal  point  method  to  overcome  these 
disadvantages.  Me  also  apply  the  proposed  procedure  to  solve  block  angular 
linear  programming  problems.  An  important  application  of  block  angular  linear 
programming  is  the  problem  of  a  multidivisional  organization t  The  divisions 
of  the  organisation  operate  almost  independently  of  each  other;  they  are  only 
coupled  by  the  fact  that  they  share  a  few  scarce  resources.  The  decomposition 
method  seeu  to  be  an  appropriate  approach  for  solving  this  type  of  problem. 


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive 
summary  lies  with  MAC,  and  not  with  the  author  of  this  report. 
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A  DECOMPOSITION  METHOD  AND  ITS  APPLICATION 


TO  BLOCK  ANGULAR  LINEAR  PROGRAMS 


CTJ  DUONG  HA1 


1 .  Introduction . 

A  general  mathematical  programming  problem  with  more  than  a  few  hundred 
variables  is  not  easy  to  solve,  especially  in  nonlinear  cases*  Fortunately, 
large  problems,  which  arise  from  practical  applications,  nearly  always  have 
some  special  features;  for  example,  sparseness  of  the  matrix  of  constraints  or 
block  angular  structure  of  the  problem.  In  general,  it  is  only  by  exploiting 
these  special  structures  that  we  can  solve  large-scale  problems. 

The  idea  of  decomposing  a  problem  into  a  collection  of  simpler  problems 
is  not  new  in  mathematics.  However,  only  in  recent  years  have  decomposition 
techniques  been  extensively  developed  for  mathematical  programming,  primarily 
to  take  advantage  of  powerful  computer  systems.  Decomposition  approaches  are 
essentially  suitable  for  systems  composed  of  several  smaller  subsystems  which 
are  independent  of  each  other  except  for  a  few  weak  connections  between 
them.  It  seems  natural,  then,  to  break  up  these  weak  interconnections  and 
handle  each  subsystem  independently,  obtaining  the  solution  for  the  overall 
system  from  the  solutions  of  the  subsystem. 
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In  Section  2  we  discuss  the  underlying  idea  of  a  decomposition 
technique  of  Dantzig-Holfe  type  applied  to  problems  of  the  form 

m  m 

inf  {  J  f. (*. )|  J  A.*.  *  a}  , 
i-1  1  1  i-1  X  1 

where,  for  fixed  i  ,  is  a  vector,  is  a  matrix,  and  f^  is 

a  convex  function  having  values  in  the  extended  real  line  (-°®,+°0  . 
This  decomposition  technique  is  very  interesting  but  it  has  several 
drawbacks  in  some  important  cases.  To  overcome  these  drawbacks  we 
combine  the  proximal  point  method  with  the  decomposition  technique 
and  propose  a  new  decomposition  procedure  to  solve  convex,  structured 
large-scale  problems. 

Section  3  deals  with  the  application  of  the  general  procedure 
in  Section  2  to  block  angular  linear  programming  problems.  An  example 
of  this  type  of  problem  is  the  problem  of  a  multidivisional  organiza¬ 
tion.  Each  division  is  operated  independently  from  the  others;  the 
only  linkage  between  them  is  that  they  all  share  a  few  scarce  re¬ 
sources.  By  exploiting  the  linearity  of  the  constraints  we  can 
modify  the  general  algorithm  and  have  specific  algorithms  for  block 
angular  linear  programming  problems.  Computational  aspects  of  the 
algorithms  are  reported;  they  show  that,  for  large  problems,  our 
procedure  is  better  than  some  other  methods. 


In  this  section  we  shall  present  a  new  approach  to  solved 


structured  convex  programming  problems.  This  approach  is  a  combina¬ 
tion  of  a  decomposition  technique  of  Dantzig-WOlfe  type  and  the  proximal 
point  method. 

2.1  The  decomposition  technique  of  Dantzig-Wolfe  type. 

He  consider  the  problem 

m  m 

inf  {  £  f.(x.)|  l  A.x.-a}  (2.1) 

i-1  1  1  i-1  1  1 

n.  n.- 

where  x^  e  B.  ,  f .  is  a  closed  proper  convex  function  from  It  1 

i  B 

to  («■•,♦«],  A.  is  l  x  n.  matrix,  and  a  e  K  .  Let  n  •  J  n. 

11  i-1  1 

and  x:  -  (x^Xj . *m)» 

Note  that  we  allow  f^  to  have  the  value  +“.  In  that  way  we 
can  incorporate  constraints  into  the  objective  function. 

first  we  need  several  definitions.  Let  f  be  a  convex  func¬ 
tion  defined  on  a  subset  S  of  Rn  and  having  values  on  [-*,+«] . 
The  set 

{(x,y)|xeS,  yeR,  y>f(x)} 

is  called  the  epigraph  of  f  and  is  denoted  by  epi  f.  The  effective 
domain  of  f,  denoted  by  dom  f,  is  defined  by 

dom  f:  -  (xj9y,  (x,y)  e  epi  f)  ; 

we  can  easily  see  that 

dom  f  -  (x|f(x)  <  •}  . 

The  conjugate  function  f*  of  f  is  defined  by 
f*(p):  -  sup  (<p,x>  -  f(x)} 


hi*-:- : 


where  <p,x>  is  the  inner  product  of  vectors  p  e  Rn  and  x  e  Hn  . 

A  convex  function  is  said  to  be  closed  if  its  epigraph  is  a 
closed  set  in  Rn+^. 

A  convex  function  f  is  said  to  be  proper  if  f(x)  <  00  for 
at  least  one  x  and  f(x)  >  for  every  x  . 

It  is  easy  to  show  that  the  conjugate  function  f*  of  any 
convex  function  f  is  a  closed  convex  function  and  f*  is  proper 
if  and  only  if  f  is  proper.  Now  we  return  to  our  problem  (2.1). 

The  linear  constraints 

AjXj.  +  A2x2+.  *  ‘*AmXm  *  a 

are  called  coupling  constraints.  They  interconnect  m  subproblems; 
each  has  its  own  set  of  variables  and  constraints.  Without  the 
coupling  constraints,  the  problem  (2.1)  could  be  solved  easily  by 
solving  separately  m  small  subproblems  inf  f^(x^)  . 

The  decomposition  techniques  for  (2.1)  are  methods  used  to 
break  up  the  coupling  constraints,  so  that  the  resulting  problem 

t 

can  be  separated  into  smaller  subproblems.  One  way  to  achieve  that 
purpose  is  to  use  the  following  well-known  results  of  convex  analysis. 

Theorem  2.1 

Under  the  conditions 

im  A?  H  ri  dom  ft  $  i  ■  1 , 2 , . . .  ,m 


(2.2) 


'TT  ■ 


where  im  A^  is  the  image  of  IR^  under  A^  and  ri  dom  f  *  is  the 
interior  of  the  effective  domain  of  f J  , 


the  infimua  in  (2.1)  is  equal  to 


sup  {(a,y)  -  l  *(A*y)>  . 


(2.3) 


It  is  attained  for  some  x  ■  (x. ,...,x  )  if  the  problem  (2.1) 

1  01 

is  feasible  and  is  +«°  if  (2.1)  is  infeasible. 

(For  a  proof  see  [Rockafellar,, 1970,  page  142].) 


Define 


g(y) :  ■  <a.y>  “  1  f^UTy) 
i-1  1  1 


(2.4) 


Then  (2.3)  can  be  rewritten  as 


sup  g(y)  . 

y 


(2.5) 


Rote  that  the  dimension  £  of  y  is  usually  much  smaller  than  n  , 
the  number  of  variables  of  (2.1).  Given  a  value  of  y  ,  we  have  to 
compute 


f?(A?y)  ■  sup  {(A?y,x^)  -  f^(x^)} 


(2.6) 


for  i  *  1,2,..., m;  which  are  m  small  subproblems.  The  scheme  to 
solve  (2.1)  using  the  above  results  is:  choose  a  value  of  y  and 
solve  m  subproblems  (2.6)  corresponding  to  that  fixed  y  .  If  y 
is  an  optimal  solution  for  (2.5),  stop.  Otherwise  update  y  and 
solve  (2.6)  again.  Repeat  until  an  optimal  solution  of  (2.5)  is 
obtained. 
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Using  this  scheme,  we  have  successfully  decomposed  the  problem 
m 

(2.1)  with  n  (*  [  n.)  variables  into  a  sequence  of  small  sub- 

i-1  1 

problems  having  n^Ci^l, . . .  »tn)  and  f,  variables.  Because  of  the 
following  interpretations,  the  dual  problem  (2.5)  is  usually  called 
the  master  program  and  the  decomposition  technique  above,  the  price- 
directive  approach.  Set  prices  (values  of  y)  on  the  common  resources 
(coupling  constraints)  and  add  the  costs  of  these  resources  to  the 
objective  function  of  each  subproblem  (terms  (A.y,x^)  in  (2.6)).  By 
appropriately  varying  these  prices,  one  can  cause  the  subproblems 
to  produce  solutions  which  yield  an  optimal  solution  for  the  original 
problem.  A  diagram  for  the  scheme  is  shown  below. 


Master  Program 


subproblem  subproblem 


Figure  2.1  A  decomposition  scheme  for  large-scale  systems. 

How,  let  us  examine  the  conditions  (2.2)  to  see  if  they  are 
reasonable.  First,  if  ft  are  polyhedral  (i.e.  the  epigraphs  of 
ft  are  polyhedral  convex  sets),  then  the  conditions  (2.2)  can  be 
weakened  to 

(im  aT)  H  (dom  ft)  #  4>  i  *  l,2,...,m  . 
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Second,  consider  the  general  linear  programming  problem 


Its  dual  problem  is 


min  (c,x) 
s.t.  Ax  ■  a 
x  £  0  . 


max  (a,y) 
.T 


s.t.  A  y  <  c  . 

The  function  f  in  this  case  is  defined  by 


f(x): 


<c,x>  if  x>0 
♦  •  otherwise, 


then  the  conjugate  function  f*  of  f  is 


f*(ATy) 


sup  {(A  y,x)  -  f(x)} 
0  if  ATy  <  c 
+  ®  otherwise  . 


Thus  the  condition  (2.2)  in  the  case  of  linear  programming  is  the 

requirement  that  the  feasible  region  of  the  dual  problem  is  nonempty. 

n. 

Finally,  if  fj^p^)  f*-nite  for  every  p^  in  H  1  (it  holds  in 

our  applications),  then  (2.2)  are  satisfied  trivially  because  the 

n. 

effective  domain  of  ft  is  the  entire  space  K  1  . 

The  decomposition  technique  described  above  is  very  elegant,  but 
in  applications  there  are  several  drawbacks: 
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(i)  The  function  g(y)  defined  by  (2.4)  may  not  be  finite  for 
some  y  .  The  set  Y:  =  {y|-°°<g(y)}  may  be  considered  as 
the  feasible  region  for  the  problem  (2.5).  But  Y  does  not 
have  an  explicit  form  because  it  is  defined  by  m  maximiza¬ 
tion  problems  (2.6).  (y  e  Y  if  and  only  if  (ft(ATy)<08  for 
all  i)). 

(ii)  g(y)  may  not  be  differentiable  everywhere  in  the  set  Y  . 
Shapiro  has  a  paper  [1978]  discussing  this  point  in  more 
detail. 

(iii)  Given  a  y,  the  corresponding  subproblems  (2.6)  may  not  have 
unique  optimal  solutions.  Consequently,  arbitrary  optimal 
solutions  x^(y*)  of  (2.6)  corresponding  to  the  optimal  solu¬ 
tion  y*  of  (2.5)  may  not  aggregate  to  form  an  optimal  solu¬ 
tion  for  the  original  problem  (2.1)  (although  some  such  solu¬ 
tion  will  do  so). 

We  illustrate  those  points  by  a  simple  example: 


min  x^  +  2x^ 


subject  to  x. 


<  2 


*2^  3 
*1  ♦  x2  -  4 

Xj  unrestricted,  x2  >,  0  . 
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In  this  example 


*1  * 1 

W 


f*(p2> 


The  images  of  A: 
satisfied  trivia 


and  are  real  numl 

\  if  £  2 

f  (x  )  m  ' 

W 

+  ®  otherwise 
'2x2  if  0  <  x2 

W  -  ■ 

+  00  otherwise 

,  a2  «  1 

■  sup  {(p^.X^  -  fj^CXj)}  * 
X1 


+  09 

if 

Pj  <  1 

2<prl) 

> 

if 

Pi  i  1 

sup  {<p2, 
X2 

V 

-  f2(x2)} 

0 

if 

P2  <  2 

3(p2-2) 

if 

P2i2  • 

and  A2 

are 

R  ,  so 

It  is  easy  to  compute 


sup  (p.-l)x 


sup  (p--2) 
0<*2<3 


conditions  (2.2) 
g  in  this  case: 


We  can  see  that  (i)  if  y  <  1  then  gCy)  is  infinite;  (ii)  g  is 
not  differentiable  at  the  point  y  =  2;  (iii)  the  optimal  solution 
of  the  problem  (2.5)  is  y*  =  2,  the  corresponding  subproblem 

sup  (y*-l)x. 

*1<2 

has  a  unique  optimal  solution  x^(y*)  *  2  but  the  subproblem 

sup  (y*-2)x_ 

0<X2<3 

has  an  infinite  number  of  optimal  solutions.  If  we  use  the  simplex 
method  to  solve  it,  we  have  X2(y*)  =  0  or  3  .  Neither  (2,0)  nor 
(2,3)  is  the  optimal  solution  of  the  original  problem,  which  is  (2,2). 

All  of  the  drawbacks  above  will  disappear  if  the  functions  f^ 
are  strongly  convex. 

Definition.  A  function  f  defined  on  a  convex  S  c  Rn  is  said  to 
be  strongly  convex  with  modulus  a  >  0  if 

f((l-A)x+Ay)  <  (l-A)f(x)  +  Af(y)  -  \  aA(  1-A)  || x-y ||  ^ 
for  all  x,y  e  S  and  0  <  A  <  1  . 

We  have  the  following  nice  properties  in  the  case  that  f^  are 
strongly  convex. 

Theorem  2.2 

If  the  f^  are  strongly  convex  then  the  ft  are  finite  every¬ 
where  and  are  Lipschitz  continuously  differentiable.  Hence  g  is 
also  finite  everywhere  and  is  Lipschitz  continuously  differentiable. 
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The  derivative  g1  of  g  at  the  point  y  is  given  by 

g*(y)  -  a  -  l  A.(x.(y)) 
i-1  1  1 

where  x^(y)  is  the  optimal  solution  of  (2.6)  for  i  *  1,2,... m. 

If  y*  is  the  optimal  solution  of  (2.5)  then  xAy*)  are  optimal 
solutions  fbr  the  original  problem  (2.1). 

(For  a  proof  see  Robinson  [19783). 

Remark.  In  this  case,  the  conditions  (2.2)  are  satisfied  trivially 
because  the  ft  are  finite  everywhere. 

Strong  convexity  is  a  very  attractive  property,  but  unfortu¬ 
nately,  in  applications  there  are  several  important  cases,  in  which 
the  objective  functions  are  convex  but  not  strongly  convex.  For  ex¬ 
ample,  a  linear  functional  is  convex  but  not  strongly  convex;  so  is 
a  positive  semidefinite  quadratic  functional.  A  question  arises: 
given  a  convex  function  can  we  somehow  "strongly  convexify"  it,  so 
that  we  can  apply  the  decomposition  technique  in  a  straightforward 
manner?  The  answer  is  yes.  It  can  be  done  by  using  the  proximal 
point  method,  but  at  the  expense  that  we  have  to  solve  a  sequence 
of  problems  instead  of  just  one. 

2.2  The  Proximal  Point  Method 

The  notion  of  the  proximal  point  was  introduced  by  Moreau  [1965]. 
A  convergence  proof  of  the  proximal  point  method  was  given  by  Martinet 
[1970,  1972]y Rockafellar  [1976  a,  b]  and  Brezis  and  Lions  [1978]  gen- 


f-  »v»7«v  /• 


eralized  the  results  and  gave  the  rate  of  convergence  in  two  slightly 
different  versions.  Rockafellar  applied  it  to  various  types  of  math¬ 
ematical  programming  problems.  (The  proximal  point  is  also  called  the 
resolvent  method. ) 

We  consider  a  general  convex  programming  problem 

min  f(x)  (2.7) 

x  e  C 


where  f  is  a  convex  function  defined  on  Bn,  having  values  in  B. 
and  C  is  a  nonempty  closed  convex  set  in  Bn.  Let  be  the  in¬ 
dicator  function  of  the  set  C,  i.e.. 


Vx) 


0 

+  00 

k 


if  x  e  C 
otherwise  . 


Define  F(x):  *  f(x)  +  i|)  (x);  then  F  is  a  proper  closed  convex 

C 

function  and  (2.7)  can  be  rewritten  as 

min  F(x) 

_n 

xe.TR 

t 

The  subdifferential  3F(x)  of  F  at  x  is  defined  by 

3F(x):  -  {teBn|F(z)  >F(x) +<  t,z-x>  ?zeBn)  . 

It  is  easy  to  see  from  the  definition  of  3F  that  x*  is  an  optimal 
solution  of  (2.7)  iff  0  e  3F(x*).  Therefore,  the  minimization 
problem  (2.7)  can  be  solved  by  finding  a  solution  of  a  generalized 
equation  0  e  3F(x).  We  now  study  the  problem  of  finding  a  solution 
for  the  generalized  equation 
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where  T  is  a  set-valued  mapping.  Let  D(T)  be  the  domain  of  T, 
i.e.  D(T)  is  the  set  of  x  such  that  T(x)  j*  <P  .  T  is  said  to 
be  a  monotone  operator  if 

(x-x',y-y'>  >0  Vy  e  T(x),  Vy'  €  T(x'),  Vx,x'  €  D(T)  . 

T  is  said  to  be  a  maximal  monotone  operator  if  it  is  monotone  and 
its  graph 

G(T) :  -  {(x,y) (yeT(x)} 

is  not  properly  contained  in  the  graph  of  any  other  monotone  operator. 

If  F  is  a  proper  closed  convex  function,  then  the  subdiffer¬ 
ential  3F  is  a  maximal  monotone  operator  (see  CBrezis  1973]).  Given 
a  positive  number  X,  the  resolvent  of  T  is  defined  by 

Jx(y):  -  (I+XT)"l(y)  . 

is  a  single-valued  function  and  it  is  a  contraction,  i.e., 

HJx(yi)~Vy2)N  -  Hyry2N  for  aI1  yi  *nd  y2  • 

(see  also  [Bre'zis  1973]).  We  have  an  important  relation 

0  €  T(x)  if  and  only  if  x  ■  J^(x)  •  (2.8) 

The  minimization  problem  (2.7)  has  thus  been  transformed  to  the  prob¬ 
lem  of  finding  a  fixed  point  of  the  resolvent  of  the  subdiffer¬ 

ential  3F  ■  3(f+iJ<j,)  . 

The  proximal  point  algorithm  generates  for  any  starting  point 


0  k 

x  a  sequence  {x  }  obtained  by  the  relation 


k+1  .  ,  kv 

x  ■  J.  (x  ) 

\ 


(2.9) 


where  {A^}  i*  *  sequence  of  positive  numbers  with  A  £  A  >  0  V  k 


k+1 

In  the  case  T  ■  3P,  x  is  the  optimal  solution  of 


■in  F(x)  ♦  yj-  ||x-xk|j2  , 


which  is  equivalent  to 


min  f(x)  ♦  yv-  ||x-xk||2  . 
xeC 


(2.10) 


In  practice,  it  is  impossible  to  obtain  the  true  optisMl  solution  of 
(2.10),  so  we  would  like  to  be  able  to  choose  xk+^  as  a  point  near 
the  optimal  solution.  We  shall  use  the  following  approximation 
criterion 

ll*k+1-  ^(*“>11  <  \  (2.11) 

where  (e^}  is  a  sequence  of  positive  numbers  such  that  J  <  •  . 

k**l 

Theorem  23  ([Rockafellar ,  1976a]) 

Let  {x  }  be  any  sequence  generated  by  the  proximal  point  al- 

t. 

gorithm  under  the  criterion  (2.11).  Suppose  (x  }  is  bounded.  Then 
{x  }  converges  to  a  point  x*  satisfying  0  c  T(x*)  and 

lim  ||xk+1  —  xk  ||  -  0  . 
k*» 
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Remark 

(i)  The  Mctiury  and  sufficient  condition  for  the  boundedness  of 
k 

{x  }  is  the  existence  of  a  solution  for  0  e  T(x)  ( [ Rocke¬ 
feller,  1976a  )  (i.e.  {xk}  is  bounded  if  and  only  if  (2.7)  has 
optimal  solutions). 

(ii)  Under  the  condition  that  T  2  is  Lipschits  continuous  at  0, 

|g 

the  sequence  {x  }  converges  to  x*  linearly.  In  addition, 

if  X^  ♦  •  ,  the  convergence  is  super linear. 

If  there  exists  x  such  that  0  c  int  T(x),  then,  with  the 
k+1  k 

exact  form  x  ■  J^x  ,  the  convergence  is  finite.  This  is 
also  true  in  the  case  of  linear  programing  problems. 

Application  of  the  proxisml  point  algorithm  to  linear  programming  probl 
Consider  a  linear  programming  problem 

min  (c,x) 
subject  to  Ax  ■  b 
x  >,  0  . 

Suppose  the  problem  has  an  optimal  solution.  Let  X  be  a  positive 

number.  He  have 

ll*-*k|I2  •  jjx— (xk— Xc)  II  2  -  y  II C II  2  ♦  <C,Xk>  . 

Thus  the  problem 

min  <c,x>  ♦  ^  ||x-xk||2 
subject  to  Ax  ■  b 

x  >  0 
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is  equivalent  to 

min  ||  x  -  (xk-Ac)||2 
subject  to  Ax  ■  b 

*  >  0  , 

k+1  .  k 

so  x  is  the  projection  of  x  -  Xc  onto  the  feasible  region. 

The  proximal  point  algorithm  in  this  case  is  the  same  as  the  gradient 

projection  method  of  Levitin  and  Polyak  [1966]  or  as  one  of  the 

methods  of  feasible  directions  of  Zoutendijk  [1976]  (but  not  as  the 

k  k+1 

gradient  projection  method  of  Rosen  [I960],  since  x  and  x  do 
not  need  to  be  on  the  same  face  of  the  feasible  polyhedron,  as  in 
Rosen's  method). 

K 

The  figure  below  shows  the  sequence  (x  }  in  an  example. 


Figure  2.2  An  example  of  the  proximal  point  algorithm. 


2.3  General  algorithm 

The  decomposition  technique  and  the  proximal  point  method  will 
be  combined  to  give  a  new  algorithm  for  problems  having  the  form 
(2.1).  For  convenience,  we  rewrite  (2.1)  here. 


min  l  f.(x.) 
i-1  1  1 


(2.1) 


subject  to  I  A.x.  ■  a 

i-1  1  1 


Let  V.i  «  dom  f.,  f(x):  -  £  f.(x.),  and 

1  x  i-i  X  1 

C:  -  {x  -  (xj.Xj,  . . .  ,xffl)  |  AjXj  ♦  ♦•••+  A^x^  «  a}  . 

m 

Suppose  (2.1)  is  feasible,  i.e.  C  n  IT  V.  t  $  .  We  apply  the 

i-i  1 

proximal  point  algorithm  to  solve  (2.1).  First  we  choose  a  sequence 
of  positive  numbers  {A^},  which  is  bounded  away  from  zero;  and  a 
starting  point  iP  -  (x?,  x|?,...,x^),  which  is  not  necessarily 

12  ID 

12  k 

feasible.  Suppose  we  have  generated  k  points  x  ,x  ,...,x  ;  then 


x  will  be  the  unique  solution  of  the  following  problem: 

min  ^  ♦  2 Hxi“Xi^2) 

i-1  x 


(2.12) 


subject  to  £  A.x.  -  a  . 
i-1  1  X 

If  ||xk+1-xk||  <  e  ,  then  xk+*  is  an  optimal  solution  for  the 

It  k+1 

original  problem  (2.1).  Otherwise,  replace  x  by  x  in  (2.12) 
and  repeat  the  procedure. 


The  problem  (2.12)  above  has  a  strongly  convex  objective  func¬ 
tion,  so  we  can  apply  the  decomposition  technique  in  a  straightforward 
manner  to  solve  it.  That  means  transforming  it  to  the  dual  problem 
(2.5)  which  is 

sup  g(y) 

7 

where  the  function  g,  defined  by  (2.4),  in  this  case  is 


g(y)  *<a,y>  +  £  min  (f.(x.)+-rr—  || x.  — x*f Jj 2  -<A?y,x.))  .  (2.13) 

i-1  x_.  1  1  ZAk  1  1  11 


The  derivative  g'(y)  is  given  by 


(y)  -  a  -  l  A.x. 


i-1 


1  x 


(2.14) 


where  x^  is  the  optimal  solution  of 

min  (f.(x.)+-rr—  l|x.-x^||2  -  <A?y,x.))  (2.15) 

x£  1  1  2Ak  1  1  11 

for  i  *  1,2 . m  . 

Note  that  (i)  g(y)  is  finite  for  every  y,  so  that  the  dual  problem 
(2.5)  is  an  unconstrained  maximization  problem,  (ii)  Once  we  com¬ 
pute  m  subproblems  (2.15)  to  obtain  the  value  of  g(y),  we  have 
the  derivative  g'(y)  almost  for  free.  Hence  we  can  use  any  gra¬ 
dient-type  algorithm  for  unconstrained  maximization  problems  to 
solve  (2.5).  In  a  schematic  way,  we  have 
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Algorithm 
Step  0 

Step  1 


Step  2 

Comments 

(i)  The  fact  that  the  starting  point  need  not  be  feasible 

is  very  useful.  In  applications,  usually  by  investigating 
the  real  situation  from  which  the  problem  arises,  we  may  have 

a  guess  which  is  infeasible  but  close  to  the  optimal  solution. 

0  12 

(ii)  Even  if  x  is  infeasible,  the  points  x  ,x  ,...  are  feasible 

1  2 

and  f(x  )  >  f(x  )  _>  ...  .  Hence,  we  have  upper  bounds  which 
get  better  and  better  at  each  iteration. 

(iii)  We  assumed  that  the  problem  (2.1)  is  feasible,  so  that 

F(x)  ■  f(x)  +  »l>c(x) 


Choose  a  starting  point  x^  ■  (x?,x!?, . . .  ,x^) ,  set  k  *  0  . 

i  4  m 


1.0  Choose  a  point  y^;  set  j  ■  0  . 

1.1  Corresponding  to  y^  solve  m  minimization  prob¬ 
lems  (2.15).  Let  x.  be  the  optimal  solutions. 

1.2  Compute  g(y.)  and  g'(y.)  as  in  (2.13)  and  (2.14). 

3  J 

1.3  If  || g* (y^ ) ||  <  €  ,  go  to  step  2;  otherwise,  use  some 

gradient-type  algorithm  to  find  a  new  point  y^+^,  set 
j  “  j  ♦  8°  hack  to  step  1.1. 


If  llx-x^H  <  c  ,  x  is  an  optimal  solution;  otherwise, 
k+1  - 

set  x  *  x  and  k  *  k  +  1,  then  go  back  to  step  1. 
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■*  '«•> 


is  a  proper  closed  convex  function  and  dF  is  a  maximal 
monotone  operator.  Consequently,  the  proximal  point  method 
goes  through.  Suppose  we  do  not  know  whether  the  problem  is 
feasible  or  not;  can  we  apply  the  algorithm  to  solve  the 
problem?  If  the  problem  turns  out  to  be  infeasible,  what  will 
tell  us  that? 

The  algorithm  can  be  used  without  knowing  the  feasibility 
of  the  problem  in  advance.  If  the  problem  is  infeasible  then 
the  problem  (2.5)  is  unbounded.  This  is  to  say  that  if  the 
primal  problem  is  infeasible  then  the  feasible  dual  is  unbounded. 

The  general  algorithm  will  be  used  to  solve  block-angular 
linear  programming  problems  in  the  following  section.  We  also 
use  the  general  algorithm  to  solve  nonlinear,  convex  structural 
engineering  problems.  The  results  are  reported  in  [Kaneko  and 


Ha,  1980]  „ 


3.1  The  Problem 


A  block  angular  linear  programming  problem  it  a  linear  programming 
problem  having  the  form 


min  +  <  c,  ,x»>+...  +  (  c  ,x  ) 

i  1  2  2  m  hi 

subject  to 

-  d. 


D1X1 


D2*2 


(3.1) 


D  x  ■  d 
m  m  hi 


*1*1  *  Vj  ♦  V.  ■ 

*1  —  >  o 

i  n  ™ 


°i  n.  m.  m_ 

where  x.  c  R  ,  c.  €  R  ,  d.  e  R  1  .  a  e  R  u  .  A.  is  an 
x  i  i  ’ 

t  x  n^  matrix,  and  is  an  nu  x  n^  matrix  for  i  ■  1,2 . m  . 

An  example  of  a  real  world  problem  that  has  the  special  form 
(3.1)  is  the  problem  of  a  multidivisional  organization.  Each  division 
operates  with  considerable  autonomy;  it  has  its  own  internal  resources 
for  production,  e.g.,  labor  and  machines.  That  accounts  for  the  con¬ 
straints  -<*i  x*l,...,m.  The  divisions  are  coupled  by  the 

fact  that  there  are  shared  resources  which  all  of  the  divisions  use, 
for  example  a  raw  material  of  limited  availability.  That  gives  rise 


co  the  coupling  constraints 


m 


lA 

i-1 


.x. 
X  X 


a  . 


Approaches  to  solving  (3.1)  can  be  roughly  divided  into  two 
categories:  improvements  of  the  simplex  method  and  decomposition 
techniques.  In  the  simplex  method  the  main  computational  difficulties 
are  the  updating  of  the  inverse  of  the  basis.  Because  the  problem  (3.1) 
has  a  special  structure,  there  are  several  ways  to  reduce  the  computa¬ 
tional  effort  and/or  the  storage  requirements  of  each  simplex  itera¬ 
tion.  The  principal  improvements  are  generalized  upper  bounding  tech¬ 
niques  ([Dantzig  and  Van  Slyke,  1967]),  basis  factorization  ([Winkler, 
1974]),  and  LU  decomposition  ([Bartels  and  Golub,  1969]  and  [Forrest 
and  Tomlin,  1972]).  The  well-known  paper  of  Dantzig  and  Wolfe  [1960] 
was  the  first  paper  to  use  a  decomposition  technique  to  solve  (3.1). 
The  idea  of  the  Dantzig-Wolfe  decomposition  is  appealing  but  computa¬ 
tional  experiences  are  erratic  ([Lasdon,  1978]  and  [Adler  and  Ulkucii, 
1973]).  Recently  there  have  been  several  attempts  to  improve  the  com¬ 
putational  aspects  of  the  Dantzig-Wolfe  decomposition,  such  as  the 
work  of  Ten  Kate  [1972],  or  the  boxstep  algorithm  of  Marsten  et  al. 
[1975].  Up  to  the  present,  there  are  no  reports  on  how  those  algo¬ 
rithms  compare  with  each  other. 


The  Algorithm 


We  are  going  to  apply  the  general  algorithm  of  Section  2  to  solve 


(3.1).  First  we  need  to  rewrite  (3.1)  in  the  form  (2.1). 


n. 

Define  V. :  »  {x.  eS  1  Id.x.  *  d,  x.  >  0}  for  i  "  1,2, ...  ,m  . 

l  i  'll  1=“ 

Suppose  V £  T4  4>  for  all  i  (if  there  exists  an  i  such  that 

«  4>  ,  then  the  problem  (3.1)  is  infeasible).  Let 

f.(x.):  *  <c.,x.)  +  (x.)  for  i  -  l,2,...m  then  f.  are 

11  11  l/.  1  1 

1 

proper  closed  convex  functions.  The  problem  (3.1)  now  has  the  form 

min  l  f.(x.) 
i-1  *  1 

m 

subject  to  [  A,x.  ■  a  , 
i-1  1  X 

which  is  the  same  form  as  (2.1).  Therefore,  we  can  apply  the  general 
algorithms  to  solve  it.  The  subproblems  (2.15),  in  this  case,  are 
quadratic  programming  problems 


min  (f-UJ+y^-  Hxi“xil|2~<Aiy’xi>) 

i 

min  ((  c,  ,x.  >  +i|u  (x.)  +  -tt—  ||x.  -  x^  ||2  -(  ATy,x.>  ) 
i  l  V.  l  2a,  "  i  l  "  i  i 

x^  i  k 

min  (-^-  ||x.-x*|l2+<c.-A*y,x.>) 


x.  k 

i 


(3.2). 


subject  to  D.x.  -  d. 

ill 

x.  >  0  . 

i  — 

The  function  g,  defined  by  (2.4),  can  be  proved  to  be  a  piecewise 
quadratic  function.  We  shall  prove  and  use  that  fact  later.  For  the 
moment,  we  use  a  general  nonlinear  algorithm  to  solve  the  dual  prob¬ 
lem  (2.5)  which  is 
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with 


sup  g(y) 
yc]R 


m 


g(y)  *  <  a,y )  -  f  min  (yr-llx.  -  x^||  2  +  <  c.  -  ATy,x.  >)  . 

i-1  x.eV.  2Ak  1  1  iix 

l  l 

The  overall  procedure  can  be  summarized  as  follows: 


Algorithm  I 


Step  0  Choose  a  starting  point  xu  .  Set  k  -  0 


Step  1 


1.0  Choose  a  point  y ^  .  Set  j  *  0  . 

1.1  For  given  j  and  y^  solve  (3.2).  for  j  »  l,2,...,m. 
Let  be  the  optimal  solution  and  z^  be  the  optimal 

objective  value. 


1.2  Compute  g(y 


.)  -  (  a,y. )  -  l  z. 

*  J  i-1  1 

a 


g'(y  )  -  a  -  J  A.x. 


i-1 

1.3  If  y^  is  a  maximizer  of  g  ,  go  to  Step  2;  otherwise, 
use  some  gradient-type  algorithm  to  find  a  new  point 
y.+^,  set  j  -  j  +  1  and  go  back  to  Step  1.1. 


Step  2  If  llx-x^H  <  e  ,  accept  x  as  (nearly)  an  optimal  solution 
for  (3.1);  otherwise,  set  xk+*  -  x  and  k  -  k  ♦  1  ,  then  go 
back  to  Step  1. 
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Notes 


k 

(i)  The  iteration  on  x  will  be  called  the  outer  loop  and 
that  on  yj  will  be  called  che  inner  loop. 

(ii)  As  mentioned  in  Section  2,  if  the  problem  (3.1)  has  optimal 
solutions  then  the  outer  loop  is  a  finite  process,  i.e. 
is  an  optimal  solution  for  some  k  . 

(iii)  The  fact  that  the  starting  point  x^  can  be  any  point  is 

clearly  an  advantage  of  our  algorithm  over  algorithms  based 
on  the  simplex  method.  For  the  simplex-type  methods,  the 
starting  point  needs  to  be  not  only  a  feasible  point  but 
also  an  extreme  point  of  the  feasible  region,  and  we  also 
need  to  know  the  inverse  of  the  corresponding  basis. 

For  a  linear  programming  problem,  the  dual  variables  play  an 
important  role.  They  have  economic  interpretations  and  they  are  use¬ 
ful  in  sensitivity  analysis.  If  we  solve  a  linear  programming  problem 
by  our  algorithm,  can  we  obtain  the  optimal  dual  variables?.  The 
answer  is  yes.  We  have  the  following  proposition. 

Proposition  3.1 

Consider  a  linear  programming  problem 

min  (c,w) 

subject  to  Aw  ■  b  (3.3) 

w  >  0  . 
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Suppose  it  has  an  optimal  solution  v  .  Then  the  quadratic  program- 
ming  problem 

min  <c,w>  +  I|w-w||2 

subject  to  Aw  ■  b 
w  >  0 

« 

has  the  Karush-Kuhn-Tucker  point  ([Mangasarian,  19691)  (w*,v*), 
where  w*  ■  w  and  v*  is  an  optimal  dual  variable  for  (3.3). 


Proof 

The  proof  of  the  identity  w*  “  w  is  given  by  relation  (2.8) 
of  Section  2.  (3.4)  is  equivalent  to 


min 

subject  to 


(c-j^w,w)  +  t^(w,w) 
Aw  ■  b 
w  >  0  . 


Hence  (w*,v*)  has  to  satisfy  the  following  relations 

c-iw+^w*-  ATv*  _>  0 

w*  >  0  Aw*  ■  b 
(w*,(c-|w  +  |w*_aV))  “  0 


Since  w*  ■  w  ,  those  relations  reo 


to 


I  o 


A( »  ^ 


/*  #- 


which  are  the  complementary  slackness  conditions  £or  (3.3).  Thus  v* 
is  an  optimal  dual  variable  £or  (3.3).  □ 


3.3  Implementation  and  Computational  Results 

We  now  consider  the  details  o£  how  we  solved  the  dual  problem 
(2.5)  and  the  quadratic  programing  problems  (3.2)^.  Recall  that  g(y) 
is  a  Lipschitz  continuously  differentiable  function,  so  to  solve  (2.5) 
we  can  use  any  gradient-type  unconstrained  minimization  algorithm.  We 
think  that  the  best  current  algorithm  is  that  of  Broyden-Fletcher- 
Golfard-Shanno  (BFGS).  We  used  the  BFGS  package  of  the  Harwell  Sub¬ 
routine  Library  available  at  the  Madison  Academic  Computing  Center 
and  known  as  VA13A  ([MACC,  19763). 

The  core  of  our  algorithm  is  the  solution  of  small  quadratic  pro¬ 
gramming  problems  (3.2)^;  the  number  of  quadratic  programming  problems 
solved  in  a  problem  may  be  several  hundreds.  For  that  reason,  we  need 
an  efficient  quadratic  programming  algorithm  which  can  take  the  optimal 
solutions  of  the  problems  of  the  previous  iteration  as  the  starting 
point  for  the  problems  of  the  current  iteration.  Furthermore,  we  want 
to  exploit  the  fact  that  the  matrices  in  the  objective  functions  are 
the  identity  matrices  times  a  constant.  First,  we  transform  problems 
(3.2)^  into  quadratic  programming  problems  having  only  nonnegativity 
constraints.  Problems  (3.2)^  have  the  following  form,  except  for  con¬ 
stant  terms 

min  (c,u)  +  ^(u,u) 

subject  to  5u  «  d  <3. 5) 

u  >  0 


where,  for  a  fixed  i  ,  D  *  D.,  d  •  d.,  u 


Xi’ 


.T  lk 
c.  -  A.y  -  t x< 
1  i'  A  1 


and 
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Using  the  orthogonal  projector  P  onto  the  null  space  of  D 
and  the  ordinary  dual  of  a  quadratic  programming  problem,  we  can  show  that 
(3.5)  is  equivalent  to 

-  --  <  v, Pv  >  +  (  APc  -  u  ,v  >  (3.6) 

v  >  0  2 

S 

where  u  is  an  arbitrary  solution  for  Du  =  d  . 

The  orthogonal  projector  P  and  the  solution  u  can  be  computed 
easily  by  using  the  Moore-Penrose  generalized  inverse  and  the  QR 
decomposition  of  D  (for  more  detail  see  [Ha,  1980]).  Note  that,  for 
a  fixed  i  ,  5  and  d  do  not  change  at  all,  so  we  need  to  compute 
P  and  u  only  once. 


There  are  several  quadratic  programming  algorithms  that  can  be 
used  to  solve  (3.6).  There  are  two  packages  available  at  the  Madison 
Academic  Computing  Center  of  the  University  of  Wisconsin-Madison, 
namely  QUADPR  and  LCPL,  but  with  these  packages  we  cannot  use  the  op¬ 
timal  solutions  of  the  previous  iteration  as  the  starting  points  for 
the  current  iteration.  The  Best-Ritter  algorithm  ([Best  and  Ritter, 
19763)  allows  us  to  do  that,  so  we  use  it  to  solve  (3.6). 

Ve  wrote  a  computer  program  to  test  the  algorithm,  using  Fortran  V 
on  the  UNIVAC  1110  of  the  Madison  Academic  Computing  Center  of  the 
University  of  Wisconsin-Madison.  In  the  program  we  set  the  stopping 
criteria  as  follows. 

_A 

(i)  Outer  loop  (Step  2).  We  set  ■  10  .  If 

ll*k*1-»k||  <  e 
ll*kn-*k||  <  ll*k|| 


then 

lem. 


k+1 


x 


is  considered  to  be  an  optimal  solution  for  the  prob- 
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(ii)  Inner  loop  (Step  1.3).  We  act  E^  *  10  ^  .  The  atoppinf  cri¬ 
terion  in  thia  ceae  ia  that  of  VA13A.  That  means  a  solution  ia 
accepted  if  a  relative  change  of  aize  £2  in  the  components  of 
j  doea  not  reduce  the  objective  value. 

tfe  generated  teat  problema  by  predetermining  the  aize  of  the 
problem,  the  aize  of  blocka  and  the  number  of  coupling  conatrainta, 
and  then  generating  randomly  data  of  the  problem.  The  matricea  A. 
and  are  90Z  dense.  Each  entry  of  those  matrices  ia  a  pseudo¬ 

random  number  in  the  range  £-50,  50]  (obtained  by  the  random  number 
routines  of  the  Madison  Academic  Computing  Center  [MACC,  1978]).  The 
coat  coefficients  are  pseudo-random  numbers  in  the  range  [-10,  10]. 

To  be  sure  that  the  problem  ia  feasible  we  randomly  generated  a  se¬ 
quence  of  integers  ia  between  0  and  5  (considered  as  a  feasible 
point)  and  then  multiplied  them  with  and  A^  to  get  the  co¬ 

efficients  of  the  right  hand  side.  The  numbers  of  variables  of  teat 
problems  and  other  information  are  shown  in  Table  3.1. 


Problenh'ss^ 

Problem  Size 

Number  of 
Blocks 

Number  of  Coupling 
Constraints 

I 

50  x  100 

10 

5 

II 

70  x  95 

3 

10 

III 

100  x  200 

20 

5 

IV 

500  x  700 

20 

10 

Table  3.1  Test  Problems  Statistics 
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We  compared  our  algorithm  with  two  linear  programming 
packages  available  at  the  Madison  Academic  Computing  Center  of  the 
University  of  Wisconsin-Madison,  namely  SIMP LX  and  FMPS-LP  ([MACC, 

1977  and  1978b3).  SIMP LX  uses  the  two-phase,  revised  simplex  method 
with  the  inverse  of  the  basis  matrix  stored  explicitly.  FMPS-LP  is  a 
part  of  the  UNIVAC  Functional  Mathematical  Programming  System.  It  uses 
variants  of  the  revised  simplex  method  with  only  nonzero  elements  of 
the  constraint  matrix  stored  explicitly,  and  the  inverse  stored  in  the 
product  form.  In  all  cases,  our  algorithm  gave  the  same  optimal  solu¬ 
tions  (up  to  five  significant  figures)  as  the  linear  programming  pack¬ 
ages.  We  kept  X  unchanged  from  one  iteration  to  the  next  of  the  outer 
loop,  but  we  ran  the  problems  with  different  \  .  Computational  results 
are  shown  in  the  following  tables.  CPU  time  includes  time  to  collect 
all  relocatable  elements  and  to  produce  an  executable  absolute  element, 
and  time  for  input/output.  It  is  measured  in  seconds.  The  starting 
point  x°  and  the  point  yA  were  taken  to  be  the  origin. 


\ 

Problem 

I 

11 

111 

lvu> 

1 

104.0 

48.5 

219.2 

10 

23.7 

68.8(2) 

34.5 

724.5 

20 

19.2 

99.5 

34.1 

30 

15.1 

80.0 

27.4 

40 

13.2 

100.0 

24.9 

50 

12.6 

117.5 

47.7 

60 

10.5 

101.5 

23.8 

70 

13.3 

126.0 

24.5 

80 

10.5 

105.5 

20.5 

90 

10.2 

170.8 

43.5 

100 

9.7 

102.5 

28.6 

Table  3.2  CPU  Time  (secs)  of  Test  Problems 
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Caspars  to  CPU  tine  of  the  linear  programing  packages. 


I 

II 

III 

IV 

SIMPLX 

32.5 

82.3 

257.8 

(3) 

FMPS-LP 

6.5 

14.7 

24.8 

3617. 0<4) 

Table  3.3  CPU  Time  (secs)  of  Linear  Programming  Packages 

(1)  Because  of  the  size  of  the  problem  and  of  the  limitation  of  our 
budget,  we  decided  to  run  only  one  run  with  X  *  10. 

(2)  There  is  noise  for  X  >  10,  i.e.  it  came  near  the  optimal  solu¬ 
tion  and  then  moved  around  that  point  erratically. 

(3)  SIMPLX  cannot  handle  a  problem  of  that  size  (500*  700). 

(4)  The  problem  needed  5  runs  to  reach  the  optimal  solution,  so  the 
CPU  time  included  time  for  putting  data  and  current  tableaus  on 
a  file,  and  retrieving  them.  The  actual  time  of  solving  the  prob¬ 
lem  should  be  less  than  3617.0  seconds,  but  certainly  it  is  much 
longer  than  724.5  seconds,  the  time  that  our  algorithm  took  to 
solve  the  problem. 

From  these  tables  we  have  several  observations. 

(1)  Comparing  to  FMPS-LP,  our  algorithm  gets  better  when  the  size  of 
the  problem  increases.  For  Problems  I  and  II,  none  of  the  runs 
of  our  algorithm  is  faster  than  FMPS-LP;  for  Problem  III  (size 
100*  200),  there  are  several  X  with  which  our  algorithm  is 
faster  than  FMPS-LP;  for  Problem  IV  our  algorithm  is  clearly 


-32- 


far  superior  to  FMPS-LP.  Certainly  the  regular  SIKPLX  is  not 
comparable  to  our  algorithm. 

(2)  Our  algorithm  is  not  very  good  in  the  case  of  Problem  II,  which 
has  3  big  blocks  (the  sizes  of  blocks  are  15*  25,  20*  30,  and 
25*40).  That  is  expected.  The  core  of  our  algorithm  is  solving 
quadratic  programming  problems  (3.2)  again  and  again.  If  we  have, 
say  10  smaller  subproblems  instead  of  3  subproblems  of  the  above 
sizes,  we  would  have  a  much  better  time,  since  the  computational 
time  tends  to  grow  polynomially  with  respect  to  the  size  of  the 
problem. 

(3)  The  values  of  X  ,  which  give  the  best  computational  times,  vary 
with  problems:  they  are  100,  1  and  80  for  problems  I,  II  and  III 
respectively.  In  theory,  the  number  of  iterations  of  the  outer 
loop  decreases  as  X  increases  (in  fact  we  can  prove  that  if  we 
use  a  X  sufficiently  big,  we  could  reach  an  optimal  solution  in 
one  iteration  of  the  outer  loop).  But  by  taking  X  too  large, 
we  may  have  numerical  difficulties  and  may  not  be  able  to  solve 
the  problem  at  all.  Fewer  iterations  of  the  outer  loop  does  not 
mean  less  computational  time  as  it  can  be  seen  on  Table  3.2.  The 
other  drawback  of  a  large  X  is  the  loss  of  accuracy  caused  by 
numerical  errors.  A  natural  alternative  seems  to  be:  taking  X 
initially  large  then  reducing  it  gradually.  But  that  idea  did 
not  work  well.  We  have  run  our  test  problems  with  different 
schemes  of  changing  X  from  one  iteration  to  the  next  and  we 


found  that  the  computational  results  are  very  erratic.  That  is 
understandable:  since  we  have  not  been  able  to  choose  X  opti¬ 
mally  for  a  given  problem,  we  should  not  expect  to  know  how  to 
change  X  iteratively  to  get  a  better  computational  time.  We 
feel  that  the  question  of  choosing  and/or  changing  X  itera¬ 
tively  can  only  be  answered  after  an  extensive  use  of  our  algo¬ 
rithm.  For  the  moment  we  suggest  to  use  X  fixed  with  a  value 
in  the  range  from  10  to  50;  if  numerical  difficulties  are  en¬ 
countered  reduce  X  . 

As  mentioned  earlier  one  of  the  advantages  of  our  algorithm  is 
that  the  starting  point  x^  can  be  anywhere.  Suppose  we  know  approx¬ 
imately  where  the  optimal  solution  should  be;  then  the  computation 
time  should  be  better.  We  ran  our  test  problems  with  the  starting 
points  taken  to  be  the  known  optimal  solutions,  rounded  to  the  nearest 
integer.  The  results,  which  are  better  as  would  be  expected,  are 
shown  in  the  following  table. 

(X-  10) 

Table  3.4  CPU  Time  of  Test  Problems  with  Different  Starting  Points 
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3.4  A  Variant  of  Algorithm  I  (Method  of  feasible  directions) 

We  know  that,  if  the  problem  (3.1)  has  optimal  solutions,  then 

the  proximal  point  algorithm  (outer  loop)  will  generate  a  finite 

sequence  of  feasible  points  {x  ,x  ,...,x  }  such  that  f(xJ  )  <  f(xJ) 

k 

for  j  ■  1,2, . . . ,k -  1,  and  x  is  an  optimal  solution  for  (3.1).  But 
if  the  problem  is  unbounded,  then  the  sequence  {xk}  is  infinite  and 
ll*k||  ®  **  k  ■+  »  .  For  practical  purposes  that  is  undesirable; 
we  want  to  have  a  simpler  criterion  for  the  case  of  unboundedness.  By 
exploiting  the  linearity  of  the  problem  we  can  modify  the  proximal 
point  method  so  that  it  either  obtains  the  optimal  solution  or  detects 
the  unboundedness  in  a  finite  number  of  iterations.  We  set  X  fixed 
and  replace  Step  2  of  Algorithm  I  by 


Step  2' 


If  P-= 

itk||  <  e 

otherwise,  for 

o  ■  min  • 

f  (xk). 

i 

(x  -x) . 

k 


accept  x  as  an  optimal  solution  for  (3.1); 
■  1  set  a  ■  1  ,  for  k  >  1  compute 


for  indices  i  such  that 


(x-x) .>  0 

l 


.(3.7) 


k  * 

If  (x  -x)^  <  0  for  all  i  ,  the  problem  is  unbounded. 
k+1  k  —  k 

Otherwise,  set  x  ■  x  +  a(x-x  )  and  k  *  k  +  1  ,  then 
go  back  to  Step  1 


Proposition  3.2 

The  modified  algorithm  with  Step  2'  above  either  detects  the  un¬ 


boundedness  of  the  problem  or  finds  an  optimal  solution  in  a  finite 
number  of  iterations  of  the  outer  loop. 


Proof 


For  k  >  1  x  is  feasible.  By  the  interpretation  of  the  prox- 

•  •  ,  .  lc 

imal  point  method  given  in  Chapter  2,  x  is  the  projection  of  x  -  Xc 

onto  the  feasible  region.  If  we  consider  Xc  as  the  cost  coeffi¬ 
cients  instead  of  c  ,  then  x  is  the  projection  of  the  negative  of 
k, 

the  gradient  at  x  onto  the  feasible  region.  a  given  by  (3.7)  is 
the  maximum  stepsize.  Therefore,  the  outer  loop  procedure  (Step  2') 
is  a  method  of  feasible  directions  applied  to  linear  programming  prob¬ 
lems.  The  finiteness  of  the  method  is  proved  by  Zoutendijk  C1976]. 

We  used  the  modified  algorithm  to  solve  our  test  Problems  I  and 
III;  ve  had  the  following  results. 


XV 

I 

III 

1 

35.8(1) 

81.4 

10 

25.9 

45.9 

20 

20.0 

30.5 

30 

16.5 

33.8 

40 

12.3 

29.9 

50 

14.2 

104.3(2) 

60 

16.4 

46.3 

70 

18.1 

38.6 

80 

10.1 

19.9 

90 

12.8 

39.4 

100 

12.9 

45.1 

Table  3.5  CPU  Time  of  Test  Problems  Using  the  Modified  Algorithms 


(1)  For  all  X  ,  except  X  a  80,  it  came  near  the  optimal  solution 
then  moved  around  that  point  erratically. 

(2)  It  came  near  the  optimal  solution  in  22  seconds  then  went  away 
again. 

We  also  used  the  modified  algorithm  to  solve  an  unbounded  prob¬ 
lem.  The  problem  is  of  size  50  *  90,  with  7  b.ocks  and  10  coupling 
constraints. 


Modified  Algorithm  With  X = 


10 

20 

30 

40 

50 

60 

70 

SIMP LX 

47. 6 

41.8 

33.6 

36.2 

35.2 

29.6 

(1) 

44.7 

FMPS-LP 


11.0 


Table  3.6  CPU  Time  for  An  Unbounded  Problem 

(1)  Numerical  errors. 

The  modified  algorithm  has  the  advantage  of  detecting  the  un¬ 
boundedness,  but  it  is  not  very  stable  when  getting  near  the  optimal 

k  - 

solution.  The  reason  is  that  near  optimum  x  and  x  are  very  close 

•  Vc  . 

together,  so  numerical  errors  in  x  may  give  much  bigger  errors  for 
f  k* 

^i  k+1  k  -  k 

the  quotient  — r — — —  ;  consequently  x  =  x  +  a(x-x  )  may  be 

(x  -x). 

i 

k 

a  point  farther  from  the  optimal  solution  than  x  .  Hence  we  rec¬ 
ommend  using  Algorithm  I  for  problems  for  which  the  existence  of  opti¬ 
mal  solutions  is  known  beforehand.  Otherwise,  we  suggest  using  the 

modified  algorithm,  but  switching  back  to  Algorithm  I  when  the  dis- 

k 

tance  between  x  and  x  is  small. 
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3.5  A  Specific  Algorithm  for  the  Dual  Problem 


Algorithm  I  is  a  straightforward  application  of  the  general  algo¬ 
rithm  in  Section  2  to  the  block  angular  linear  programming  problem 
(3.1).  We  have  not  taken  advantage  of  the  linearity  of  the  problem; 
in  particular  the  algorithm  used  to  solve  the  dual  problem  sup  g(y) 
is  suitable  for  general  nonlinear  optimization  problems.  Geoffrion 
[1970b] observes  that  the  dual  of  a  quadratic  programming  problem  taken 
with  respect  to  a  subset  of  the  constraints  is  a  piecewise  quadratic 
function.  That  is  exactly  our  case  and  we  are  going  to  give  a  proof 
of  the  above  observation.  We  consider  the  problem 


min  <c,x>  +4<x,Cx> 


subject  to  Dx  “  d 


Ax  ■  a 


x  >  0  . 


(3.8) 


Suppose  C  is  symmetric  and  positive  definite  and  at  every  point  of 
the  set  {x:>0|Dx  =  d}  ,  the  gradients  of  the  active  constraints  are 
linearly  independent.  As  before,  for  a  given  y  we  define  g(y) 
to  be  the  objective  value  of  the  problem 

min  <c,x>  +  x,Cx)  +  <  y,a  -  Ax> 
subject  to  Dx  ■  d  (3.9) 

x  >  0  . 
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Proposition  3.3 

g(y)  is  a  piecewise  quadratic  function. 

Proof 

For  a  fixed  y^  ,  let  x(yQ)  be  the  optimal  solution  of  (3.9) 
corresponding  to  y^  and  define  the  set  of  indices  M  by 

M:-  {i|x(yQ)^-  0}  . 

Consider  the  problem 

min  <c,x>  +  -j(x,Cx>  +(y,a-Ax> 
subject  to  Dx  *  d  (3.10) 

x^  ■  0  for  i  €  M  . 

The  problem  above  is  feasible  since  x(y q)  satisfies  the  constraints 

Let  x(M,y)  be  the  optimal  solution  and  v(M,y)  be  the  multipliers. 

Let  J  be  the  index  set  of  v(M,y)  corresponding  to  the  constraints 

x.  ■  0  for  i  €  M  . 
x 

Let  K  be  the  matrix  representing  these  constraints;  that  is  K  is 
a  matrix  of  |M|  rows,  with  the  k-th  row  having  an  entry  of  1  in 
the  column  corresponding  to  the  k-th  element  of  M  and  zero  in  all 
other  entries.  Define 


~D“ 

"d” 

and  d : * 

_K_ 

_0_ 

then  the  constraints  of  (3.10)  can  be  rewritten  as 

A  A 

Dx  -  d 
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and  the  Karush-Kuhn-Tucker  conditions  for  (3.10)  are 


(3.11) 


Under  our  assumptions  the  matrix  above  is  invertible;  consequently 
x(M,y)  and  v(M,y)  are  affine  functions  of  y  . 

x(M,y)  and  v(M,y),  being  the  optimal  solution  and  the  Lagrange 
multipliers  of  (3.10),  need  not  be  the  optimal  solution  and  the  La¬ 
grange  multipliers  of  (3.9).  But  if 


x(M,y)  >  0  and  v.(M,y)  >  0  for  j  e  J  , 

then,  by  setting  all  of  the  Lagrange  multipliers  corresponding  to  the 
constraints  x.  >  0  i  i  M  equal  to  zero,  we  can  verify  easily  that 
x(M,y)  and  v(M,y)  satisfy  the  Karush-Kuhn-Tucker  conditions  for 
(3.9).  Hence  x(M,y)  in  this  case  is  the  optimal  solution  for  (3.9). 
Conversely,  if  x(M,y)  and  v(M,y)  are  the  optimal  solution  and  the 
Lagrange  multipliers  of  (3.9),  then  certainly  we  have  x(M,y)  _>  0  and 
v.(M,y)  >  0  for  j  e  J  .  The  set 

Qm:*  {y|x(M,y)>0  and  v.(M,y)>0  for  j  e  J}  (3.12) 

n  —  2 
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is  a  polyhedron.  On  each  such  polyhedron  g(y)  is  a  quadratic  func¬ 
tion.  Hence  g(y)  is  piecewise  quadratic. 

Using  the  above  proposition  we  have  another  algorithm  to  solve 
(3.1). 

Algorithm  II 

Step  0  Choose  any  point  as  the  starting  point,  set  k  *  0  . 

Step  1 

•  k 

1.0  For  given  k  and  x  ,  choose  y^  and  set  j  =  0  . 

1.1  Solve 

min  NXi  “  xi II 2  +  *  ci  ”  Ai?j  *xi  > 

s.t.  DjX.  -  dj^  (3.13) 

x.  >  0 

i  ” 

for  i  ■  1,2, ... ,m  . 

Let  x^  be  the  optimal  solution  and  v^  be  the  corres¬ 
ponding  multipliers.  Let  x  ■  (x. ,x. . x  )  . 

I  m 

1.2  Determine  the  index  set  M  =  (h/(x).  *  0}  and  the  corres- 

n 

ponding  polyhedron  Qu  defined  by  (3.12).  Solve  the 
quadratic  programming  problem 

sup  g(y)  .  (3.14) 

yeQM 


If  it  is  unbounded,  then  the  original  problem  (3.1)  is  in¬ 
feasible.  Otherwise,  let  y  be  the  optimal  solution. 


1.3  If  ||g* (y) It  <  £  go  to  Step  2.  Otherwise  compute 


yj+l  *  y  +  a8,(y) 

where  a  is  a  real  number  such  that  g(y^+^)  >  g(y)  • 

Set  j  ■  j  +  1  and  go  back  to  Step  1.1. 

Step  2  Compute  x(M,y)  by  (3.11).  If  ||x(M,y)  -  x^||  <  £  ,  x(M,y) 

k+1  — 

is  an  optimal  solution  for  (3.1).  Otherwise  set  x  *  x(M,y) 
and  k  ■  k  +  1  ,  then  go  back  to  Step  1. 


Remark 

The  partition  of  the  space  of  y  into  polyhedra  is  finite  and  in 
the  process  of  finding  the  optimal  solution  of  the  dual  problem  a  poly¬ 
hedron  is  never  repeated.  This  is  true  because  in  Step  1.3  we  require 
g(y^+j)  >  g(y)  and  y  is  the  maximizer  of  g(y)  on  the  polyhedron 
containing  y^  ,  so  that  polyhedron  is  never  repeated.  Therefore,  the 
inner  loop  is  a  finite  procedure. 

We  now  explain  in  detail  how  to  determine  the  polyhedron  Qu  in 

n 

Step  1.2.  Recall  that  Q  is  given  by 

QM  ■  {ylx(H,y)  >0  and  v.(M,y)^0  j  €  J} 
n  J 

where  x(M,y)  and  v(M,y)  solve  the  system  of  linear  equations  (3.11). 
We  have 


c 

-Df 

-1 

>1  „-lsT,-  -l-T.-l-  -1 

C  -CD  (DC  D  )  DC 

o-isT.--i-T.-ri 
C  D  (DC  D  ) 

A 

D 

0_ 

m 

~  -l-T.-l-  -1 
-(DC  D  )  *DC 

>-» 

• 

1 _ 

-42- 


In  our  case  C  ■  A  I  so  ve  can  simplify  the  above  expression  to 
obtain 


I/X  -  D* 

-1 

X(I  -  dt(ddt)“1d) 

DT<DDTrr 

6  0_ 

at 

-(66t)-16 

i  (6BX)“l 

pn  .1 

I/X  -  D* 

r-l 

1 

1 

n  | 

LyJ  1 

1 

o 

<Qj 

d 

x(M,y)  «  X(l-DT(DDT)“1B)(ATy- c)  +  6T(DDT)-1d 
v(M,y)  «  -(DDT)_1D(ATy- c)  +  i(DDT)“1d 
Let  P  be  the  orthogonal  projector  onto  the  null  space  of  D  and 

A+  /S 

P  be  the  generalized  inverse  of  D  .  We  know  that 

A  AAf  .1  A 

P  -  I  -  D  (DP  ) 


and 


D+ 


6T(S£T>-1 


I 


SO 

x(M,y)  *  XP(ATy-c)  +  D+d 


and 

v(H,y)  -  ~(D+)T(ATy  -  c)  ♦  j (DDT)-1d  . 


(3.1 


Because  x(M,y)  is  the  solution  of  (3.11),  x^(M,y)  ■  0  for 

all  i  €  &  .  Hence  (X,  can  be  expressed  explicitly  as 


.  ;V  ^ 


Qjj  *  {y|  (AP^(A^y  -  c)  +  D+d)^  £  0  ,  i  {  M  and 

(-(D+)T(ATy-c)  +-|(DDT)_1d)j  >0,  j  e  j} 

Using  the  expression  (3.15)  of  x(M,y)  we  have  an  explicit  expression 
for  g(y),  y  «  QM  • 

g(y)  *^<x(M,y),  x(M,y)>  +  <c,x(M,y)>  +  (  y,'a  -  Ax(M,y)  > 

»-^<PAy,PAy>  +  <  y,a  +  XAPc  -  AD  d> 

+  <c,D  d-APc>  +  ^-(D  d-APc,D  d-APc>  . 

After  deleting  constants  in  the  objective  function,  we  see  that  the 
problem  (3.14)  is  equivalent  to 

sup  - (  PA  y,PA  y)  +  (  y ,  a  +  XAPc  -  AD  d  ) 

subject  to  (AP(A^y  -  c)  +  D  d)^  _>  0  i  €  M 

(-(D+)T(ATy-c)  +  i(DDT)"1d)j  >0  j  e  J  . 

In  our  original  problem  (3.1),  the  matrix  D  has  the  block  angular 
form 

D  - 

The  matrix  K  representing  constraints  x^  ■  0  also  has  the  same 


-44- 


Hence  the  block  diagonal  structure  is  preserved  when  we  compute  D 

A 

and  P  . 


In  updating  y  from  to  y^+^,  w  go  from  one  polyhedron 

A 

to  another  and  we  have  to  update  D  and  P  .  Usually  the  set  of 
active  constraints  changes  by  just  a  few  indices;  consequently,  we 
have  to  update  only  a  few  small  matrices.  By  using  the  QR  composition 

A^  A  0 

of  D1fD-,...D  that  we  already  have,  the  updating  of  D  and  P  is 
12  IB 

much  more  efficient  ([Gill  et  al.,  1974]). 

We  wrote  a  computer  program  to  test  the  Algorithm  II;  again  we 
used  our  test  Problems  1  and  III.  He  had  the  following  results 


X 

I 

III 

10 

9.0 

19.0 

20 

5.8 

17.3 

30 

5.0 

17.4 

40 

4.9 

16.7 

50 

4.2 

16.3 

60 

4.3 

15.1 

70 

4.2 

15.8 

80 

4.3 

15.0 

90 

3.9 

14.4 

100 

4.0 

(1) 

Table  3.7  CPU  Time  Using  the  Algorithm  II 


-46- 


(1)  Numerical  errors. 


We  used  the  new  system  at  MACC,  the  UNIVAC  1100/80,  so  these 
numbers  are  not  comparable  to  the  numbers  of  the  Tables  3.2  and  3.3 
which  ve  obtained  by  using  the  old  system,  the  UNIVAC  1110.  The  new 
system  is  approximately  2.5  times  faster  than  the  old  one,  so  to  com¬ 
pare  the  algorithms  ve  need  to  multiply  the  numbers  of  Table  3.7  by 
a  factor  of  2.5.  Then  we  can  see  that,  for  the  Problem  I,  the  Algo¬ 


rithm  II  is  a  little  bit  better  then  the  Algorithm  I,  but,  for  the 
Problem  III,  the  Algorithm  II  is  not  as  good  as  the  Algorithm  I. 

We  think  that  the  Algorithm  II  can  be  improved  computationally. 
For  the  moment,  in  finding  the  maximizer  for  the  piecewise  quadratic 
function  g(y)  we  have  not  been  able  to  use  the  optimal  solution  for 
g(y)  in  one  piece  (polyhedron  (3.12))  as  a  starting  point  for  the 


problem  in  the  next  piece  (problem  3.14))  (because,  by  choosing  y.+^ 
as  in  Step  1.3,  we  are  not  sure  that  the  polyhedron  containing  y^+^ 
is  adjacent  to  that  of  y^).  Each  time  we  solved  the  problem  (3.14) 
we  had  to  start  from  scratch,  i.e.,  we  had  to  use  the  simplex  algorithm 


to  find  the  starting  point.  That  really  slowed  down  the  whole  process. 
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